FIX-#5702: Fix passing RangeIndex to loc. #5719
Signed-off-by: mvashishtha <mahesh@ponder.io>
This is not actually a full fix, see this:
import pandas
import modin.pandas as pd
df = pd.DataFrame({'a': range(20)})
pdf = pandas.DataFrame({'a': range(20)})
print(len(df.loc[range(10)])) # shows "11"
print(len(pdf.loc[range(10)])) # shows "10"
if isinstance(axis_loc, pandas.RangeIndex):
    axis_lookup = axis_loc
I'm not sure this is the proper fix... I mean, this if should certainly be there as a fastpath solution! But...
To me it feels like there's a bug somewhere else that this fastpath change just masks.
And I think I know where that is, just a few lines below! Note how axis_labels.slice_indexer() expects both its start and stop arguments to be labels, which means that they describe a closed interval, while RangeIndex/range describe a semi-open interval!
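The mismatch described above can be reproduced with plain pandas. This is a minimal sketch and not the Modin code path: `labels` and `r` are illustrative names, and the index is assumed to be a default RangeIndex over 20 rows, as in the reproducer earlier in the thread.

```python
import pandas

labels = pandas.RangeIndex(20)  # stand-in for axis_labels (assumption)
r = range(10)                   # half-open: covers 0..9

# slice_indexer treats start and end as labels, so both ends are included.
closed = labels.slice_indexer(r.start, r.stop, r.step)
n_closed = len(labels[closed])  # one label more than the range holds

# Passing the last label the range actually contains keeps the sizes equal.
# (Only valid for a non-empty range; shown purely for illustration.)
matched = labels.slice_indexer(r.start, r[-1], r.step)
n_matched = len(labels[matched])

print(n_closed, n_matched, len(r))
```

Feeding `r.stop` straight into `slice_indexer` is what yields the extra row; any real fix also has to handle empty ranges and non-unit steps, which this sketch does not attempt.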
Signed-off-by: Vasily Litvinov <fam1ly.n4me@yandex.ru>
@mvashishtha I've pushed 0c8d268 which should fix both the original issue and issue with passing
History digging showed that this was introduced in #3694, cc @dchigarev Also we're using
it seems that only the
>>> idx.slice_indexer(start="a", end="c", step=None)
slice(0, 3, None)
>>> idx.slice_indexer(start="a", end="c", step=None, kind=None)
<stdin>:1: FutureWarning: 'kind' argument in slice_indexer is deprecated and will be removed in a future version. Do not pass it.
slice(0, 3, None)
@vnlitvinov with your commit,
import modin.pandas as pd
df = pd.DataFrame([1, 2, 3])
print(df.loc[:2])
print(df._to_pandas().loc[:2])
seems the bounds have to be different for
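For comparison, a small pandas-only sketch of why the two kinds of bounds differ: a label slice passed to loc includes its end label, while a range used as a list of labels stops before its stop value. The frame mirrors the one in the snippet above; `pdf`, `by_slice` and `by_range` are illustrative names.

```python
import pandas

pdf = pandas.DataFrame([1, 2, 3])

by_slice = pdf.loc[:2]        # label slice: end label 2 is included
by_range = pdf.loc[range(2)]  # range: labels 0 and 1 only

print(len(by_slice), len(by_range))
```

Any conversion from a range to a label slice therefore has to shift the upper bound, whereas a slice that the user wrote directly must be passed through unchanged.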
Signed-off-by: mvashishtha <mahesh@ponder.io>
@vnlitvinov I added test cases for
Maybe we should special-case
Signed-off-by: mvashishtha <mahesh@ponder.io>
Treating it as 0 seems to work. Done.
@vnlitvinov @modin-project/modin-core can anyone reproduce the test failure here? https://github.com/modin-project/modin/actions/runs/4307371046/jobs/7512428160
never mind! I needed
Signed-off-by: mvashishtha <mahesh@ponder.io>
Co-authored-by: Vasily Litvinov <fam1ly.n4me@yandex.ru>
LGTM!
This reverts commit 55bbb7b. @vnlitvinov pointed out "you've checked .stop, not .step for being None"
LGTM!
What do these changes do?
FIX-#5702: Fix passing RangeIndex to loc.
flake8 modin/ asv_bench/benchmarks scripts/doc_checker.py
black --check modin/ asv_bench/benchmarks scripts/doc_checker.py
git commit -s
docs/development/architecture.rst is up-to-date